Inference as prediction

Jane Watson 

University of Tasmania 
<jane.watson@utas.edu.au> 


Inference, or decision making, is seen in curriculum documents as
the final step in a statistical investigation (Australian Education
Council, 1991). For a formal statistical enquiry this may be associated
with sophisticated tests involving probability distributions. For
young students without the mathematical background to perform
such tests, it is still possible to draw informal inferences based on
data of various sorts, for example by comparing two graphical
representations (e.g., Watson & Moritz, 1999). In doing so it is important
to be able to state the assumptions that are the foundation for the
decision made (Whitin, 2006). This article considers a straightforward
context where students are asked to make predictions. These
predictions are informal inferences that can be based on aspects of
the scenario, the students’ appreciation of the context, and their
cognisance of the data presented.

Making predictions in the face of incomplete information can be 
hazardous. One of the goals of the statistics curriculum is to assist 
students in making predictions that have a high probability of 
turning out to be correct, or at least to assist students in being able 
to judge what the likelihood of being correct is. In this situation there 
is usually a risk of being wrong and some children (and adults) are 
not risk-takers. Many aspects of the mathematics curriculum also 
reinforce a view against making predictions without certainty. Much
of the problem solving carried out in the classroom, for example,
results in one correct answer to a problem. Proof is also put forward
as a way of ensuring that conclusions are true. In the world outside
the classroom, however, students often are required to make
decisions where several alternatives may appear reasonable.

As educators, beyond being statistics or mathematics educators, 
we want students to become critical thinkers who, when asked to 
make a prediction, can do so by taking into account all of the 
perspectives available in the context where the prediction is to be 
made. We do not necessarily want them to ignore all information 
besides that which might be supplied by a statistic. 


amt 63 (1) 2007 


In order to assist in understanding the development that takes
place during the years of schooling in students’ willingness to make
inferences, a small study is presented where 30 students in grades
3, 5, 7 and 9 were asked to make a prediction. Three years later, 22
of them were asked to make the same prediction. The range of
responses across ages and over time sheds light on
what teachers can expect in their classrooms. In this
case there is also an indication that the school, a
private girls’ school in an Australian capital city, may
have adapted its curriculum in the intervening years.
The author was not involved with the school except in
relation to the research carried out.

The context of the prediction question was an interview
that was part of research into students’
understanding of concepts in the chance and data
curriculum (Watson & Moritz, 2001). Students were
presented with a task to create a pictograph representing
the number of books each member of a small
group of children had read. To do so, they had small
cards that were drawings of named children and single
books, as shown in Figure 1. All students could count
the books accurately and create a representation,
although the form of the presentation varied. Figure 2
shows typical representations as created by the
students; not all students worked with exactly the same pictograph.

There was an indication in the information presented to the students
through the protocol that in total the girls had read more books than
the boys (for 2 girls and 2 boys, the totals were 10 and 4 respectively;
for 3 girls and 4 boys the totals were 14 and 9 respectively). After the
representation phase of the interview was completed, students were
introduced to a new student, named Helen, who had come to the
class, and asked to predict how many books Helen might have read.
Following this another new student,
Paul, was introduced and the same
question asked. About half of the time
Helen was introduced first and half of
the time, Paul. Stop now and think how
you would respond to this question.
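
The totals quoted above can be turned into per-child averages, which makes the difference between the girls and the boys easier to see. The following is a minimal Python sketch and not part of the original study: the function name `mean_books` and the dictionary layout are illustrative assumptions, and only the totals and group sizes come from the text.

```python
from fractions import Fraction

# Totals and group sizes as reported in the text for the two versions
# of the pictograph; the variable and key names are illustrative only.
pictographs = {
    "2 girls, 2 boys": {"girls": (10, 2), "boys": (4, 2)},
    "3 girls, 4 boys": {"girls": (14, 3), "boys": (9, 4)},
}


def mean_books(total, children):
    """Return the mean number of books per child as an exact fraction."""
    return Fraction(total, children)


for label, groups in pictographs.items():
    for group, (total, children) in groups.items():
        mean = mean_books(total, children)
        print(f"{label}: {group} read {float(mean):.2f} books each on average")
```

In both versions the girls' mean is roughly double the boys' mean, which is the pattern some students drew on when predicting for Helen and Paul.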

Based on the responses, 30 at the beginning and 22 at the end of the
three-year period, nine different elements, or components, were observed
in the responses. Due to the nature of the question, responses could not be
judged as “correct” or “incorrect”; there was no right answer. Some elements of
responses, however, could be judged as more statistically appropriate than
others. The most sophisticated responses might, for example, include
several types of components, even contrasting statistical and
non-statistical elements.

Figure 1. Cards used by students to create a pictograph.

Figure 2. Two typical representations created by students.


Examples of each of the elements of responses are presented with
the grade level indicated. Grades 6, 8, 10, and 12 responses are from
the second interviews. Here all of the responses are from different
students. In places the interviewer’s questions have been shortened
to save space.

Four elements of responses were non-statistical in nature. These
usually occurred in isolation. First was the refusal to make any
prediction, most often without a reason.

S1: [How many did Paul read?] I couldn’t tell because I haven’t been
given the information. [Grade 5]

In the second case some students made guesses with no justification.

S2: [Paul?] About 3. [What makes you say that?] Nothing much.
[Just a guess?] Yes. [Grade 5]

Other responses used a third element of made-up stories.

S3: [What about if Paul came along. Can you tell how many books
Paul might have read?] Umm [pause] Maybe 2. [Why do you say
2?] Because he might not like reading so he might think that
reading is a bit boring so that is why he would have read about
2 books. [What do you think about Helen, do you think she
would have read a few?] She could have read about 3 because
she might like reading, but then she wanted to get this really
good series that her friends might have said are really good, but
she thought it might have been 10 or 12 books, but then she
realised there was only 3 so she just got that set. [Grade 3]

Finally a few students used a fourth strategy of selecting values
based on a pattern or a gap in their data displays.

S4: [Paul?] He could have read 2 or 5. [What makes you say that?]
Because we are missing numbers all the way along. [Grade 9]

Five elements of responses were statistical in nature. One element
was ‘reference to the range of values in the representation,’ often
used in conjunction with other statistical ideas. The following
response is an isolated reference to the range.

S5: [Paul?] Between, maybe 1 and 6. [What makes you say that?]
Because 1 is the ... [least], or 0, and 6, because he could have
read no books or he could have read 6 books, because that is
sort of the highest. [Grade 9]


A second element focusing on the mode, referred to as ‘most,’ was 
infrequently observed. 

S6: [Helen?] I’d say 5 ... because two people read that amount.
[Paul?] Five books again. [Because it is the most?] Yes. [Grade 6]

Two elements of responses were associated with centres. Some of
the descriptions were based on the ‘intuitive idea of middle.’ These
general references to middle were more likely to be associated with
other statistical descriptions, as shown in the next two extracts, than
references based on the mean.

S7: [Helen?] About 5. [What makes you say that?] Well the girls
seem to enjoy reading the most and basically nearly everybody’s
read around 5 or 4 except a few people so that indicates that she
might like reading as much as they would. [Paul?] Probably 
about 3 or 4 because all the boys except Andrew don't seem to 
like reading as much, and so yeah. [How do you decide 3 or 4 
particularly?] ... Because Ian only read 1, Danny 2 and Terry 4, 
with the exception of Andrew who read 6 so most of the boys, 
the majority read less than the girls and Andrew. [Grade 8] 

S8: [Helen?] I say about 5 because it seems to be around the middle
of how many books people have borrowed because the smallest
is about 1 or 2 and the biggest is about 6 or 7, so it would be 
about 4 or 5 I’d say. [Paul?] I say he read 3 books because well 
just from seeing the boys, assume to, like that’s the average of 
how many the boys borrowed. [So you think it might be different 
for the boys because they seem to have read less?] Mmm. [OK, 
and how did you determine that it was three?] Um, well I sort of 
had a look at Andrew and Terry and Danny and Ian and just 
took a guess of about how many there would be, like how many 
books Paul had read. [Grade 10] 

The other element related to centres was specifically related to the
‘mean,’ with little interest shown in other features.

S9: [Helen?] Um, about five or something. [OK, what makes you say
that ... five?] Um, maybe a bit less, actually four or something.
Because whatever ... you add them up and do the average or 
something like that ... add them together equals 30 and divide 
them by 7 ... [student uses calculator] Um, 4 point whatever. 
[Paul?] It would be different, you’d have to add four on so it 
would be 34 and then divide by 8. [Grade 8] 
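
S9's procedure amounts to updating a mean: the total of 30 books for 7 children gives the prediction for Helen, and adding four books for one more child gives the prediction for Paul. A minimal Python sketch of that arithmetic follows. The function name `mean_after_adding` is hypothetical, and treating Helen's prediction as exactly four books follows S9's own "add four on".

```python
def mean_after_adding(total, count, new_value):
    """Return the new total, count and mean after one more child is added."""
    new_total = total + new_value
    new_count = count + 1
    return new_total, new_count, new_total / new_count


# Figures quoted by S9: 30 books read by 7 children.
total, count = 30, 7
helen_prediction = total / count  # S9's "4 point whatever"
print(f"Mean for the original group: {helen_prediction:.2f}")

# S9 then adds four books for Helen and divides by 8 to predict for Paul.
total, count, paul_prediction = mean_after_adding(total, count, 4)
print(f"Mean once Helen's four books are included: {paul_prediction:.2f}")
```

The two means, about 4.29 and 4.25, are nearly identical, so the order in which the new students arrive makes little practical difference to a mean-based prediction.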

The fifth statistical element was contained in responses that 
‘distinguished between the boys and girls.’ No student talked about 
the difference between boys and girls without also mentioning an 
idea associated with middle. Examples are the responses S7 and S8. 
A few students included some contextual assumptions alongside 
their statistically-based predictions. 

S10: [Paul?] Well it depends, if he's around the kind of reader that
they are and that if he has around the same reading average 
then we'd add all of these books up and divide them by 4 to find 
the average. And that would be about how many. [OK, shall we 
do that?] [counts aloud] 6, 9, 10, 14 divided by 4 is 3.5 [OK so 
you'd expect him to have read about that many?] Yes, 3 and a 
half books. [Helen?] Um well if she came along at the same time 
then no; but if she came along after Paul then it might make a 
difference. [Right.] Because, um ... Oh no, probably wouldn't 
because he's just got an average of it and then it would just be 
divided by the same numbers and so it would still be 3 and a 
half for Helen. [Grade 7] 




In the initial interviews, 25 out of 30 responses were non-statistical
in nature, including all of the Grade 3 and Grade 5 responses. Two
Grade 7 and 3 Grade 9 responses were statistical in nature. Three
years later, 19 out of 22 responses were statistical in nature. The
three non-statistical responses were from students now in Grade 6.
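
The counts above can be expressed as percentages to compare the two rounds of interviews. This is a small Python sketch using only the counts stated in the text; the helper name `percent` is an illustrative assumption.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Initial interviews: 25 of 30 responses were non-statistical.
initial_statistical = 30 - 25
# Second interviews: 19 of 22 responses were statistical.
later_statistical = 19

print(f"Initial interviews: {percent(initial_statistical, 30)}% statistical")
print(f"Three years later: {percent(later_statistical, 22)}% statistical")
```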

It would seem unlikely that the improvement was only due to
maturation. The percentages of statistical responses ordered by
grade, not year collected, are shown in Table 1. The percentages of
Grade 7 and 9 statistical responses in Year 1 are lower than the
Grade 6 and 8 percentages in Year 4, even though the Grade 7 and 9
students were on average at least a year older at the time of the
interviews. It would
appear that in the intervening three years between interviews the
school focussed on statistical reasoning across the upper primary
and middle school years.


Table 1. Percentage of statistical responses ordered by grade.

Grade    Year collected    Statistical responses
3        Year 1            0%
5        Year 1            0%
6        Year 4            40%
7        Year 1            25%
8        Year 4            100%
9        Year 1            37%
10       Year 4            100%
12       Year 4            100%


A pictograph representing books read is a relatively simple
context within which to ask students to make predictions. It could
be used in a classroom to encourage discussion about the criteria to
be used in estimation. From a teaching standpoint, how can these
responses be used in the classroom to create a culture of including
as many components as possible in reaching a sound prediction?
Obviously it is important to acknowledge that “Paul and Helen are
not here to tell us how many books they read,” so “we cannot be
certain of our prediction.” That fact, however, should not stop the
class from making the best prediction possible based on all of the
information available. It is interesting that some of the interviewed
students who knew how to calculate the mean apparently did not
see the need to consider any of the other information available, for
example, gender. Others who were less precise about “middles” took
into account other facets of the data and context. This is encouraging
and should be applauded. Another interesting point to discuss
would be S10’s prediction that Paul and Helen could have read 3½
books. The class could discuss the meaning of this in the context. In
some contexts 3½ might have little meaning but here it is possible
to suggest reading half a book. What does the class think of this
prediction? Also of interest is S10’s presentation of some assumptions
about Paul. Other students would likely be able to suggest
their own assumptions that could influence the prediction. A class
discussion should incorporate as many of these suggestions as
possible and it is likely that different classes will arrive at different
“best” predictions. Students could be asked to write individual
summaries after a class discussion to express their final preferences,
again realising that there are no “correct” predictions, just
some that are more statistically appropriate than others.

In the classroom it is important to consider the full range of
responses, from that of not making any prediction because all the
information is not available to that of making thoughtful predictions
that recognise statistical characteristics of the data and acknowledge
uncertainty. All responses should be handled carefully by the
teacher, valuing particularly those responses that contain several
elements and promoting a reflective consideration of them. Just
telling students to “calculate the mean number of books read” or
accepting this response without further discussion does not lead to
an appropriate beginning for the appreciation of what statistical
inference is about.


Acknowledgement 

The original research for this article was funded by the Australian
Research Council and the author thanks an anonymous referee for
helpful editing suggestions. Jonathan Moritz conducted the interviews.


References 

Australian Education Council (1991). A National Statement on Mathematics for Australian
Schools. Melbourne: Author.

Watson, J. M. & Moritz, J. B. (1999). The beginning of statistical inference: Comparing two
data sets. Educational Studies in Mathematics, 37, 145-168.

Watson, J. M. & Moritz, J. B. (2001). Development of reasoning associated with pictographs:
Representing, interpreting, and predicting. Educational Studies in Mathematics, 48, 47-81.

Whitin, D. J. (2006). Learning to talk back to a statistic. In G. F. Burrill (Ed.), Thinking and
Reasoning with Data and Chance (pp. 31-39). Reston, VA: National Council of Teachers of
Mathematics.


investigation ideas 


► If you have timber of two different lengths, how many different
rectangular picture frames can be made? What if you had timber of
three different lengths, four different lengths, or n different lengths?


► A calculator has the 6 and 5 keys broken.

How many different ways can you perform the calculation 65 + 56?
What about 65 - 56 or 65 x 56?


► Is it the case that the cube of any number
can be written as the sum of odd numbers?




